Sequentially-Allocated Merge-Split Sampler for Conjugate and Nonconjugate Dirichlet Process Mixture Models
Abstract
This paper proposes a new, efficient merge-split sampler for both conjugate and nonconjugate Dirichlet process mixture (DPM) models. These Bayesian nonparametric models are usually fit using Markov chain Monte Carlo (MCMC) or sequential importance sampling (SIS). The latest generation of Gibbs and Gibbs-like samplers for conjugate and nonconjugate DPM models update the model parameters effectively, but can have difficulty updating the clustering of the data. Merge-split samplers were developed to overcome this deficiency, but until now they have been limited to conjugate or conditionally conjugate DPM models. The new MCMC sampler proposed here, called the sequentially-allocated merge-split (SAMS) sampler, borrows ideas from sequential importance sampling. Splits are proposed by sequentially allocating observations to one of two split components, using allocation probabilities that condition on the previously allocated data. The SAMS sampler is applicable to general nonconjugate DPM models as well as conjugate models. Further, the proposed sampler is substantially more efficient than existing conjugate and nonconjugate samplers.
Similar resources
An Improved Merge-split Sampler for Conjugate Dirichlet Process Mixture Models
The Gibbs sampler is the standard Markov chain Monte Carlo sampler for drawing samples from the posterior distribution of conjugate Dirichlet process mixture models. Researchers have noticed the Gibbs sampler’s tendency to get stuck in local modes and, thus, poorly explore the posterior distribution. Jain and Neal (2004) proposed a merge-split sampler in which a naive random split is sweetened ...
Splitting and Merging Components of a Nonconjugate Dirichlet Process Mixture Model
Abstract. The inferential problem of associating data to mixture components is difficult when components are nearby or overlapping. We introduce a new split-merge Markov chain Monte Carlo technique that efficiently classifies observations by splitting and merging mixture components of a nonconjugate Dirichlet process mixture model. Our method, which is a Metropolis-Hastings procedure with split...
A Split-merge Markov Chain Monte Carlo Procedure for the Dirichlet Process Mixture Model
We propose a split-merge Markov chain algorithm to address the problem of inefficient sampling for conjugate Dirichlet process mixture models. Traditional Markov chain Monte Carlo methods for Bayesian mixture models, such as Gibbs sampling, can become trapped in isolated modes corresponding to an inappropriate clustering of data points. This article describes a Metropolis-Hastings procedure that...
Parallel Sampling of DP Mixture Models using Sub-Clusters Splits
We present an MCMC sampler for Dirichlet process mixture models that can be parallelized to achieve significant computational gains. We combine a nonergodic, restricted Gibbs iteration with split/merge proposals in a manner that produces an ergodic Markov chain. Each cluster is augmented with two subclusters to construct likely split moves. Unlike some previous parallel samplers, the proposed s...